(54) Text improvement

(57) The invention relates to a method and device for text improvement applicable after a format
conversion. First, the text parts are detected by a block-growing process in a source image containing
text and graphics or photographs. Then the text part is scaled to the desired size by standard
interpolation techniques. Post-processing, including thresholding and morphological filtering, then
sharpens the character edges to improve the visual appearance.

Description

[0001] The invention relates to a method and device for text improvement. 

[0002] Digital display devices are increasingly matrix devices, e.g. Liquid Crystal Displays, in which
each pixel is mapped to a location on the screen, with a one-to-one relationship between raster data and
display points. This technology requires a scaling system that changes the format of the input
video/graphic signal so that it matches the size of the device, i.e. its number of pixels. The scaling
block is based on a filter bank that performs pixel interpolation as the zooming factor varies. Solutions
currently available on the market apply undifferentiated processing to the graphic raster, which leads to
unavoidable artifacts. Low-pass filters usually reduce pixellation, also known as the seesaw effect on
diagonals, and prevent the signal from suffering aliasing due to sub-sampling, but they also introduce
other annoying effects such as blurring. The relevance of the perceived artifacts, and the kind of
artifact that is preferably accepted as unavoidable, depend on the content of the displayed signal.

[0003] It is, inter alia, an object of the invention to provide a simple text improvement. To this end
the invention provides a text improvement as defined in the independent claims. Advantageous embodiments
are defined in the dependent claims.

[0004] Starting from the above-mentioned observations, a novel technique is provided here that takes the
image content into account and applies an ad hoc post-processing only where it is required. So, in
accordance with the present invention, text improvement is based on detection: the processing is active
only in the presence of a text region. A viable area of application of this invention is the improvement
of text readability on LCD devices when, as is usually the case, other parts of the displayed signal
should not be affected.

[0005] A remarkable characteristic of the technique presented here is its very low computational
complexity. This aspect gives a high effectiveness in terms of cost/performance ratio. In fact, inserting
the proposed algorithm into the circuitry that carries out all the digital processing needed to resize
the input of the matrix display device presumably raises the display quality, as perceived by the average
user, without considerably affecting its cost.

[0006] These and other aspects of the invention will be apparent from and elucidated with reference to
the embodiments described hereinafter.

[0007] In the drawings:

Figs. 1-3 illustrate the operation of a morphological filter; and

Fig. 4 shows a block diagram of a system in accordance with the present invention. 

[0008] The invention proposes the design of a text detection algorithm, together with a post-processing
block, for text enhancement. It will be shown that the invention significantly improves content
readability and leads to good perceptual results for the whole displayed signal, while keeping the
computational complexity of the total scaling system very low.

[0009] The remainder of this document is organized as follows. First the general scaling problem and the
currently available algorithms are briefly summarized. Thereafter concepts concerning format conversion
by a non-integer factor are introduced. Next the post-processing block, characterized by the thresholding
operation and the morphological filter, is summarized and its features are described. Finally the text
search strategy is presented, and the detection algorithm and its cooperation with the previously
introduced post-processing block are elucidated.

The general framework 

[0010] Resizing pictures to a different scale requires format conversion. This operation involves
well-known resampling theory, and classical filtering procedures are currently used to accomplish it.
Filters avoid aliasing problems in frequency, freeing room for the repetitions introduced by the sampling
operation in the original domain. Among the interpolation filter families, first-order polynomial
interpolators are commonly used, in which the reconstructed pixel is a weighted mean of the nearest pixel
values. These filters belong to the class of Finite Impulse Response (F.I.R.) filters.

[0011] Inside standard display devices the format conversion problem is usually also handled with linear
filtering. A particularly simple class of F.I.R. filters reconstructs a pixel in between two available
ones by taking the value on the line joining these two adjacent points. There are many other possible
techniques, for example pixel repetition, or polynomial interpolation with more complex weighting
functions. The perceived quality of images processed with these different solutions is generally not very
high; there are impairments and artifacts that are not completely avoidable. This implies that some
compromise is needed in order to reach an acceptable or, at best, a satisfactory cost/performance ratio.

In the past the simplest solution used pixel repetition. A more recent solution, see the Philips scaler
PS6721, still uses linear filtering but with a slightly different shape of the impulse response, to
improve the transition steepness.

[0012] Measuring the rise time of the step response is a classical way to assess the performance of an
interpolator in the presence of an edge. In fact, low-pass filters reduce edge steepness, and a less
steep edge is perceived as blurring.

[0013] Moreover, the actual impact of this annoying artifact depends on the kind of displayed signal. In
the case of natural images, a blurring effect can be tolerated to a certain extent.

[0014] For artificial patterns, by contrast, a slight smoothing effect is recommended only when the
content is meant to approach a natural impression (as in virtual reality and 3D games). In this case
filtering is used as an anti-aliasing process. For the same reason these kinds of filters are used on
text/characters to avoid the pixellation effect, also known as the seesaw effect on diagonals.
Interpolation filters are anti-aliasing filters too, because they reduce the highest frequency of the
input signal. Moreover, supposing a black text on a white background, the amount of gray levels
introduced by this kind of filter should be a small percentage of the amount of black. If this is not the
case, we have an artifact instead of a picture improvement, and images are perceived as blurred. For
instance, when bilinear interpolation, or more sophisticated filters like bicubic ones, are used on small
characters (the commonly used size of 10*12 points) and thin lines, these appear defocused. In all these
cases it seems better to use no filters at all, or at least none of the low-pass filters currently
available.

[0015] From the above considerations we can conclude that, because format conversion requires resampling
and the filtering process is therefore unavoidable, some other solution has to be found to address the
above issue. In the case of text, a simple idea is to apply a post-processing block after the scaler to
clean up all the gray levels where characters are detected. Because of the scale change, this operation
cannot be performed using only a simple threshold block. In fact, thresholding is a non-linear operator
that introduces non-uniform patterns when it converts gray-level characters to binary values, another
kind of artifact that is highly noticeable. Morphological filters are an interesting class of operators
that are able to change irregular patterns into more regular ones. They are introduced in a later
section.

Format conversion by a rational factor 

[0016] In today's digital display devices, images are frequently represented with a matrix of pixels, so
that a fixed picture format is required. When a signal with a different format arrives at the input of a
matrix display, format conversion is unavoidable. Supposing a graphic card has generated the signal, the
selection of a graphic format different from the one used by the display depends on the requirements of
the running software application. At the moment it is not advisable to constrain the graphic card output
only by the requirements of the display.

[0017] We recall that today's standard graphic formats are VGA, SVGA, XGA, SXGA and higher. Format
conversion between these raster sizes requires in almost all cases a rescaling by a rational factor.
This, by itself, leads to a noticeable degradation of the resampled picture. For example, when we need to
change format from VGA at the display's input to XGA at the display's output, the factor involved is 8/5,
equal to 1.6 times the size of the original picture. This format conversion ratio clearly requires
sub-pixel resolution, but with standard linear filtering techniques this is not possible without paying a
high blurring cost.
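
The 8/5 factor quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the raster sizes in `FORMATS` are the commonly used ones and are an assumption, since the text names the formats without giving their dimensions, and `scale_factor` is a hypothetical helper.

```python
from fractions import Fraction

# Assumed raster sizes (width, height); the text only names the formats.
FORMATS = {
    "VGA": (640, 480),
    "SVGA": (800, 600),
    "XGA": (1024, 768),
    "SXGA": (1280, 1024),
}


def scale_factor(src: str, dst: str) -> Fraction:
    """Horizontal zoom factor dst/src, reduced to lowest terms."""
    return Fraction(FORMATS[dst][0], FORMATS[src][0])


if __name__ == "__main__":
    f = scale_factor("VGA", "XGA")
    print(f, float(f))
```

Under these assumed sizes, VGA to XGA reduces to 8/5, so the numerator 8 gives the density of the intermediate grid and the denominator 5 the sub-sampling step.
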
[0018] Let $s(i,j)$ be the input signal at position $(i,j)$ in the input grid and
$\hat{s}(\hat{i},\hat{j})$ the signal after format conversion at position $(\hat{i},\hat{j})$ in the
denser output grid. Supposing a rescaling from VGA to XGA, i.e. by a factor of 8/5, for every 5 pixels at
the input of the sampler there will be 8 pixels at its output. A rescaling by a rational factor
conceptually relies on an intermediate "super-resolution" grid obtained using a zooming factor equal to
the numerator, in the example 8. In this case the "super-resolution" grid is eight times denser than the
original one. Taking two input values, $s(i,j)$ and $s(i+1,j)$, on the same line $j$ at positions $i$ and
$i+1$, the interpolated value $\hat{s}(\hat{i},\hat{j})$ will be positioned in between the two original
values in the super-resolution grid, i.e. in one of the eight possible positions available on the grid.
We express this fact by the following equation:

$$\hat{s}\left(i + \tfrac{k}{8},\, j\right) = w_1\, s(i,j) + w_2\, s(i+1,j), \qquad k \in [0 \ldots 7]$$

where, for the linear interpolator,

$$w_2 = 1 - \delta$$

with $w_1 = 1 - w_2$ the complementary weight of the weighted mean. Here $k$ is the position of the pixel
in the dense grid; this position is also called the filter phase. In a linear filter $\delta \propto k$,
where $\delta$ is the distance of the pixel to be interpolated from the two adjacent original ones. The
signal at the output grid is obtained by taking values on a grid 5 times coarser. The sub-sampled signal
at the output is expressed as follows:



$$\hat{s}\left(i + \tfrac{5}{8}\,k,\, j\right), \qquad k \in [0 \ldots 7]$$



Because the output grid is not a multiple of the input grid, original pixel values will often be lost and
replaced with an average value, as described above. If the input pattern is black-and-white text, its
pixels will frequently be replaced by a weighted average of their values, i.e. a gray level.
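
As a minimal sketch of the resampling just described, the hypothetical function `rescale_row` below linearly resamples one image row by the rational factor 8/5. It assumes the standard linear-interpolation weights (1 - k/8 on the left neighbour and k/8 on the right one) and clamps at the end of the row; neither choice is taken from the text.

```python
def rescale_row(row, num=8, den=5):
    """Linearly resample one image row by the rational factor num/den.

    Output sample n sits at input position n * den / num, i.e. at input
    pixel i with phase k on the num-times denser intermediate grid.
    """
    out_len = len(row) * num // den
    out = []
    for n in range(out_len):
        i, k = divmod(n * den, num)            # left neighbour and filter phase
        right = row[min(i + 1, len(row) - 1)]  # clamp at the end of the row
        # Assumed weights: (num - k)/num on the left pixel, k/num on the right.
        out.append(((num - k) * row[i] + k * right) / num)
    return out


if __name__ == "__main__":
    # A two-pixel white stroke (255) on a black (0) background.
    print(rescale_row([0, 0, 255, 255, 0]))
```

For the row `[0, 0, 255, 255, 0]` this yields `[0.0, 0.0, 63.75, 223.125, 255.0, 223.125, 63.75, 0.0]`: the two-level input acquires intermediate gray levels at both edges of the stroke.
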

Text improvement via thresholding 

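[0019] A threshold operator placed at the output of the scaling filter recovers a black-and-white
pattern, or more generally a bicolor one, by choosing the level nearest to the interpolated value
according to the following relationship:

$$\tilde{s}\left(i + \tfrac{5}{8}k,\, j\right) = \begin{cases} l_k & \text{if } \hat{s}\left(i + \tfrac{5}{8}k,\, j\right) < \vartheta \\ l_w & \text{if } \hat{s}\left(i + \tfrac{5}{8}k,\, j\right) > \vartheta \end{cases}$$

where $\tilde{s}$ is the thresholded signal, $l_k$ is the black level, $l_w$ the white one, and
$\vartheta$ the threshold level.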

[0020] Note that, in the case of black/white and bicolor patterns, the threshold function can be
integrated into the filter operator by setting $l_k$ and $l_w$ in accordance with the actual filter
phase. In this way the threshold operation recovers the original bicolor levels from the interpolated
ones according to their new positions. In regions where the amount of gray levels introduced is too high,
this simple operator improves the perceived sharpness of edges. However, this is paid for with the
introduction of irregular patterns. The next section shows how this problem can be solved.
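
The thresholding step can be sketched as follows. This is an illustration rather than the circuit described here: `threshold_row` is a hypothetical name, the threshold is assumed to sit midway between the two levels (so that each pixel snaps to the nearest level), and a value exactly on the threshold is assigned to white, a case the relationship above leaves open.

```python
def threshold_row(row, black=0, white=255, level=None):
    """Snap every gray value in a scaled row to the nearer of two levels."""
    if level is None:
        level = (black + white) / 2   # assumed: midpoint threshold
    # Values below the threshold become black; all others become white.
    return [black if value < level else white for value in row]


if __name__ == "__main__":
    scaled = [0.0, 0.0, 63.75, 223.125, 255.0, 223.125, 63.75, 0.0]
    print(threshold_row(scaled))
```

Applied to the gray-level row in the example, the result is `[0, 0, 0, 255, 255, 255, 0, 0]`: the gray levels are gone, and which side each edge pixel falls on depends on the value the interpolator produced at that position.
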

Morphological filtering algorithms 

[0021] Mathematical morphology is introduced to solve the problem of text deblurring because a
morphological filter, working both as a detector and as a non-linear operator, is able to eliminate gray
levels without destroying the regularity of the characters. Moreover, in the case of bicolor patterns, it
is able to restore a specified regularity where required.

In general the detector, called the structuring element, is a small matrix (usually 2x2 or 3x3); it can
recognize a particular pattern in the data, in our case the pixels of the rasterized image at the display
output, and substitute that pattern with a different set of requested values. Supposing the morphological
filter is used after the threshold block, on a bi-level pattern, the structuring element works as a
binary mask on the underlying data, performing a set of logical operations between the bits of the
running matrix and the bits of the scanned data. An output equal to 1 signifies that a specified pattern
has been identified.

[0022] A particular operator belonging to the morphological filter family, also called the "diagonal"
filter, applies the following set of logical operations to the data:

$$Y = X_4 \cup (P_1 \cup P_2 \cup P_3 \cup P_4)$$

$$P_1 = (X_4^c \cap X_7 \cap X_6^c \cap X_3)$$
$$P_2 = (X_4^c \cap X_3 \cap X_0^c \cap X_1)$$
$$P_3 = (X_4^c \cap X_1 \cap X_2^c \cap X_5)$$
$$P_4 = (X_4^c \cap X_5 \cap X_8^c \cap X_7)$$

Here, $X_0 \ldots X_8$ is the set of data currently analyzed by the structuring element; in the case of
binary data, $\cup$ is the classic logical OR operator, $\cap$ is the classic logical AND operator, and
the superscript $c$ denotes the complement (logical NOT). The structuring element orders the data in its
framework as shown in Fig. 1.

[0023] The output $Y$ of the set of logical operations introduced above replaces the previous value at
the origin of the data matrix, $X_4$ in the figure. Note that if the result of
$P_1 \cup P_2 \cup P_3 \cup P_4$ is 0, then $X_4$ remains unchanged, whereas if the result is 1, then
$X_4$ is always replaced by 1.

[0024] On closer inspection, the set $P_1, P_2, P_3, P_4$ of logical operations corresponds to the
detection of the patterns shown in Fig. 2. The patterns in Fig. 2 are diagonal patterns of black and
white pixels, in the case of binary images. According to the above relations, when one of these
configurations is found, then at the origin of the detected region, identified by a circle in the figure,
a 0 value is replaced by a 1. This operation, in terms of its effect on patterns, fills holes in diagonal
configurations.
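
A minimal sketch of the "diagonal" filter on a binary image is given below. It assumes that the 3x3 window is numbered row by row (X0 X1 X2 / X3 X4 X5 / X6 X7 X8), which is one plausible reading of Fig. 1, and it leaves border pixels untouched; `diagonal_fill` is a hypothetical name.

```python
def diagonal_fill(img):
    """Apply the diagonal hole-filling filter once to a 0/1 image."""
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]          # borders are copied unchanged
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            # Assumed row-major window: X0 X1 X2 / X3 X4 X5 / X6 X7 X8.
            x = [img[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            empty = not x[4]
            p1 = empty and x[7] and not x[6] and x[3]
            p2 = empty and x[3] and not x[0] and x[1]
            p3 = empty and x[1] and not x[2] and x[5]
            p4 = empty and x[5] and not x[8] and x[7]
            # Y = X4 OR (P1 OR P2 OR P3 OR P4)
            out[r][c] = 1 if (x[4] or p1 or p2 or p3 or p4) else 0
    return out


if __name__ == "__main__":
    diagonal = [
        [0, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ]
    for row in diagonal_fill(diagonal):
        print(row)
```

In this example the two diagonal pixels become a uniform 2x2 block of ones. The filter reads from the input and writes to a copy, so a pixel filled at one position does not influence its neighbours within the same pass.
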

[0025] Note that the same operation can be performed, instead of with logical operators, with a look-up
table (LUT) addressed by the configuration of bits in the structuring element. Supposing the cells of the
element are ordered according to the above figure, this configuration has the following address:

$$\mathrm{LUT}_{\mathrm{address}} = X_8\, X_7\, X_6\, X_5\, X_4\, X_3\, X_2\, X_1\, X_0$$

where each $X_i$ is equal to 1 or 0 according to the value in the $i$-th position of the matrix. To fill
holes, the LUT at positions XXX10X01X, 01X10XXXX, X10X01XXX and XXXX01X10 is set to 1; at all other
positions it is set to 0. Here X means don't care.
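
The look-up table can be built directly from the four don't-care patterns. The sketch below is an illustration under two assumptions: the address is read with X8 as the most significant bit, as in the address layout above, and the table output is ORed with X4 so that pixels already set stay set. The names `PATTERNS`, `LUT`, `lut_address` and `filter_output` are hypothetical.

```python
# Don't-care patterns written from X8 (leftmost) down to X0 (rightmost).
PATTERNS = ["XXX10X01X", "01X10XXXX", "X10X01XXX", "XXXX01X10"]


def matches(address, pattern):
    """True if the 9-bit address agrees with the pattern on all fixed bits."""
    bits = format(address, "09b")   # bit string X8 X7 ... X0
    return all(p in ("X", b) for p, b in zip(pattern, bits))


# One entry per window configuration: 1 where a diagonal hole is detected.
LUT = [int(any(matches(a, p) for p in PATTERNS)) for a in range(512)]


def lut_address(x):
    """Pack the window values X0..X8 into an address with X8 as the top bit."""
    return sum(bit << n for n, bit in enumerate(x))


def filter_output(x):
    """Assumed combination: keep X4 if set, otherwise use the LUT entry."""
    return x[4] | LUT[lut_address(x)]


if __name__ == "__main__":
    window = [0, 0, 0, 0, 0, 1, 0, 1, 0]   # X5 = X7 = 1, X4 = X8 = 0
    print(lut_address(window), filter_output(window))
```

A 512-entry table replaces the four AND terms and the OR with a single memory access per pixel, at the price of storing one bit per window configuration.
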

[0026] From a conceptual point of view, because of the hole-filling function of the "diagonal"
structuring element, the set of operations described above on a 3x3 structuring element is equivalent to
replacing a diagonal pattern, however oriented in a 2x2 matrix, with a uniform block. This concept is
clarified in Fig. 3.

Block diagram of system embodiment 

[0027] The drawing in Fig. 4 shows a block diagram of the total system, in which the main concepts of the
architecture for the detector and the post-processing block are sketched. An input image InIm s is
applied to a Search Window part SW and a Text Detector part Det. The input image InIm s, possibly
modified in some regions by the text detector part Det where text is recognized, is applied to a scaler
Scal, such as the commercially available scaler PS6721. The scaled image from the scaler Scal is applied
to a post-processing part Post-proc that produces the output image OutIm ŝ.

Search Window and Text Detector 

[0028] The detector is a key operator. In fact, it determines whether the input signal is binarized and
further processed or simply filtered with the linear scaler. As previously stated, detection is
specifically designed to recognize text patterns. When the constraints imposed by the detector are not
satisfied, the signal does not undergo this further processing step. Detection is performed with a local
sensor that recognizes the number of colors in a small region. So in principle it works as a search
window that scans the raster image to discover text areas.

[0029] To keep the memory cost low, the window was designed with a fixed vertical height, equal to 3
lines in the domain of the original signal. Its horizontal width, by contrast, varies according to the
image characteristics and is based on a simple growth criterion defined using some intuitive assumptions
about the properties of text. The current assumptions on graphic text are as follows:

1. A text area is a two-color region in which the text is one color and the background is the other.

2. In a text area the text color is perceptually less present than the background color.

3. A text region has a reasonable horizontal extension.

[0030] These assumptions determine the constraints on the patterns that the detector recognizes as text
regions. As can be seen, neither filtered text nor text on a non-uniform background is recognized as a
text region. This is a reasonable restriction, because in these cases the threshold operator would
introduce more artifacts than benefits. Furthermore, requiring an unbalanced percentage of the two colors
prevents the detector from identifying as text two-color regions with potentially dangerous patterns. An
example is the chessboard pattern, which recurs quite often, for example in window folder backgrounds.
Finally, the third condition prevents small bicolor fragments of the raster signal, which are presumably
borders or other small pieces of graphic objects, from being identified as text regions.

[0031] The conditions introduced above are used to define some parameters that adjust the behavior of the
detector so that it reaches the best performance. Consider the above-introduced input raster signal
$s(r,c)$ at position $(r,c)$. The search window is denoted by $q(r,c)$, with $(r,c)$ being the
coordinates of the block's origin, which identify its reference pixel in the image; the relative
coordinates, identifying a cell in the search window, are referred to the block origin and are denoted by
$(i,j)$. Furthermore, the detector height and width are denoted by $h$ and $w$. Whereas $w$ is a varying
parameter, $h$ is fixed, to satisfy line memory constraints, and its current value is $h = 3$.

[0032] Let $N_c$ be the number of colors detected in the search window. In accordance with the previously
described block growing, the width $w$ increases following this search strategy:

$$\begin{cases} w(k+1) = w(k) + 1 & \text{if } N_c \le 2 \\ w(k+1) = w(k) = w & \text{if } N_c > 2 \end{cases}$$

$N_c > 2$ is the exit condition from the growing search strategy. When the exit condition is met, the
system returns the final block width $w$.

[0033] Together with the block-growing process, two color counters are incremented at each new step $k$.
Note that a step $k$ corresponds to the evaluation of a new input pixel in the horizontal direction.
Calling $\gamma_1$ the number of pixels with color $c_1$ and $\gamma_2$ the number of pixels with color
$c_2$, the counters are updated at the corresponding block-growing step in the following manner:

$$\begin{cases} \gamma_1(\tau+1) = \gamma_1(\tau) + 1 & \text{if } q(i + w(k+1),\, j + h) = c_1 \text{ for } h = 1 \ldots 3 \\ \gamma_1(\tau+1) = \gamma_1(\tau) & \text{otherwise} \end{cases}$$

and

$$\begin{cases} \gamma_2(\tau+1) = \gamma_2(\tau) + 1 & \text{if } q(i + w(k+1),\, j + h) = c_2 \text{ for } h = 1 \ldots 3 \\ \gamma_2(\tau+1) = \gamma_2(\tau) & \text{otherwise} \end{cases}$$

where $\tau = 3 \cdot w(k+1) + h$ is a new counting step in the search window, i.e. a new pixel evaluated
using the growing window at step $k$.
[0034] Finally, we introduce the last parameter, $\xi$, representing the ratio between the two color
counters once the background is identified, according to the following relationship:

$$\xi = \begin{cases} \dfrac{\gamma_1}{\gamma_2} & \text{if } \gamma_1 \ge \gamma_2 \quad (c_1 = \text{background}) \\[2mm] \dfrac{\gamma_2}{\gamma_1} & \text{if } \gamma_1 < \gamma_2 \quad (c_2 = \text{background}) \end{cases}$$


[0035] Once the algorithm has exited from the search strategy, the detection window is available for its
content to be identified.

[0036] As mentioned above, a first condition to be satisfied for a region to be recognized as text is
that the block has a reasonable extension. Let

$$\varepsilon = \min w$$

be the minimum value, in terms of pixels, allowed for a region to be recognized as a text region. The
condition to be satisfied by a text region is:

$$w \ge \varepsilon$$

The current value fixed for the parameter $\varepsilon$ is $\varepsilon = 300$.

[0037] Recalling that $\xi$ is the ratio between the background and the text color counts, a second
condition to be satisfied for the block to be recognized as a text area is:

$$\xi \ge \bar{\xi}$$

where $\bar{\xi}$ is a modifiable parameter currently fixed at $\bar{\xi} = 1.2$. In other words:

$$\text{if } \xi < \bar{\xi} \;\Rightarrow\; q \text{ is not a text window}$$

[0038] The block is discarded as not being a text block when either of the above conditions is not
satisfied. The new search window will be $q(r, c+w)$, starting at position $(r, c+w)$ in the original
image, or at $(r+3, c)$ if the end of the line was reached in the previous step.

[0039] Following this strategy, the entire image is scanned by the search window and text regions are
detected. As text is detected, the previously described post-processing operations are applied.
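
Putting the search strategy together, the sketch below scans an image in bands of three lines and reports the blocks that satisfy both conditions. It is a simplified illustration under stated assumptions: the image is a list of rows of colour values, the column index restarts at 0 for each new band, a block of width 0 is skipped by advancing one column, and `detect_text_blocks` is a hypothetical name. The defaults 300 and 1.2 are the values given in the text.

```python
from collections import Counter


def detect_text_blocks(image, h=3, epsilon=300, xi_min=1.2):
    """Return (r, c, w) for every block accepted as a text region."""
    blocks = []
    line_width = len(image[0])
    for r in range(0, len(image) - h + 1, h):      # bands of h lines
        c = 0
        while c < line_width:
            counts = Counter()
            w = 0
            while c + w < line_width:              # block growing
                column = [image[r + dr][c + w] for dr in range(h)]
                if len(set(counts) | set(column)) > 2:
                    break                          # a third colour appeared
                counts.update(column)
                w += 1
            if w >= epsilon and len(counts) == 2:  # wide enough, two colours
                (_, n_bg), (_, n_text) = counts.most_common(2)
                if n_bg / n_text >= xi_min:        # background dominates
                    blocks.append((r, c, w))
            c += max(w, 1)                         # always move forward
    return blocks


if __name__ == "__main__":
    chessboard = [[(x + y) % 2 for x in range(8)] for y in range(3)]
    print(detect_text_blocks(chessboard, epsilon=4))   # balanced colours: rejected
```

With a small `epsilon` for demonstration purposes, a chessboard band is rejected because its two colours are balanced, which matches the role of the second assumption on text.
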

[0040] Going back to Fig. 4, an input image is first subjected to a block-growing process BlGr based on
whether the number of different colors does not exceed 2 ($N_c \le 2$), a first indication of the
presence of text. As soon as the number of colors exceeds 2, the block-growing process BlGr is stopped,
and the other parameters Outpar are determined, which represent the three criteria for text listed above.
Based on these parameters Outpar, it is determined whether there is a text region (Txt reg ?). If so, the
background color $c_{background}$ is set to white, and the text color $c_{text}$ is set to black.

[0041] The resulting image is subjected to the scaling operation SCAL.

[0042] After the scaling operation SCAL, the text region is subjected to a thresholding operation
(Threshold), the output of which is applied to a morphological filter (Morph. Filt.). Thereafter, white
is set back to the background color $c_{background}$, and black is set back to the text color $c_{text}$.
The result of this operation forms the output image OutIm ŝ that is displayed on a matrix display D.
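
The colour substitution that brackets the scaling and post-processing steps can be sketched as a pair of mappings. Both function names are hypothetical, and the levels 0 and 255 for black and white are an assumption about the pixel representation.

```python
def to_black_white(block, background, white=255, black=0):
    """Map the background colour to white and the text colour to black."""
    return [[white if pixel == background else black for pixel in row]
            for row in block]


def restore_colours(block, background, text, white=255):
    """Inverse mapping, applied after thresholding and morphological filtering."""
    return [[background if pixel == white else text for pixel in row]
            for row in block]


if __name__ == "__main__":
    region = [["blue", "red"], ["blue", "blue"]]
    normalised = to_black_white(region, background="blue")
    print(normalised)
    print(restore_colours(normalised, background="blue", text="red"))
```

Working on a fixed black/white pair means that the threshold and the morphological filter do not have to be adapted to the actual colours of each detected text region.
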

[0043] A primary aspect of the invention can be summarized as follows. A novel technique is suggested
that takes the image content into account and applies an ad hoc scaler post-processing only where it is
required. A viable area of application of this invention is the improvement of text readability on LCD
devices when, as is usually the case, other parts of the displayed signal should not be affected. It is,
inter alia, an object of the invention to provide a simple ad hoc text detector. The invention proposes
the design of a text detection algorithm, together with a post-processing block, for text enhancement.
The invention significantly improves content readability and leads to good perceptual results for the
whole displayed signal, while keeping the computational complexity of the total scaling system very low.
The invention is preferably applied in LCD scaler ICs.

EP 1 117 072 A1

[0044] It should be noted that the above-mentioned embodiments illustrate rather than limit the
invention, and that those skilled in the art will be able to design many alternative embodiments without
departing from the scope of the appended claims. In the claims, any reference signs placed between
parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the
presence of elements or steps other than those listed in a claim. The word "a" or "an" preceding an
element does not exclude the presence of a plurality of such elements. The invention can be implemented
by means of hardware comprising several distinct elements, and by means of a suitably programmed
computer. In the device claim enumerating several means, several of these means can be embodied by one
and the same item of hardware. The mere fact that certain measures are recited in mutually different
dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims

1. A method of text improvement, the method comprising the steps of:

detecting (SW, Det) text in an image; and

processing (Post-proc) the image in dependence on a result of the text detecting step.

2. A method as claimed in claim 1, wherein the detecting step (SW, Det) comprises the step of determining
(BlGr) a region for which it holds that the number of colors does not exceed 2.

3. A method as claimed in claim 1, wherein the detecting step (SW, Det) comprises the step of determining
whether a text color is less present than a background color.

4. A method as claimed in claim 1, wherein between the detecting step (SW, Det) and the processing step
(Post-proc), a scaling step (Scal) is carried out to adjust first numbers of pixels per line and lines
per image of the image to second numbers of pixels per line and lines per image that fit in with a
display (D) on which the image is displayed.

5. A method as claimed in claim 4, wherein the processing step (Post-proc) comprises the step of
subjecting a scaled image to a thresholding operation.

6. A method as claimed in claim 4, wherein the processing step (Post-proc) comprises the step of
subjecting a scaled image to a morphological filtering.

7. A method as claimed in claim 4, wherein the detecting step (SW, Det) comprises the step of setting a
background color ($c_{background}$) to white, and a text color ($c_{text}$) to black, and the processing
step (Post-proc) comprises the step of setting white back to the background color ($c_{background}$), and
black back to the text color ($c_{text}$).

8. A device for text improvement, the device comprising:

means for detecting (SW, Det) text in an image; and

means for processing (Post-proc) the image in dependence on a result of the text detecting means.

9. A display apparatus, comprising:

a device for text improvement as claimed in claim 8; and
a display (D).